Learning objectives
Referring to knowledge
Get acquainted with fundamental results of probability theory and statistics. Understand their relevance to important issues in experimental and theoretical physics.
Understand the power and limitations of Monte Carlo methods, in particular when applied to physical contexts.
Develop comprehensive skills on the topic, ranging from the ability to write code to perform computations on specific data to the ability to prove simple mathematical statements in order to address theoretical questions.
Get acquainted with the techniques for data analysis and the basic concepts of data mining. Specifically, learn to code in Python to implement the analysis and to use a variety of software tools for data mining, including neural networks.
Teaching blocks
1. The concept of probability
1.1. Conditional probability and Bayes' theorem
1.2. Frequentists versus Bayesians
2. Random variables
2.1. Mean, variance and moments
2.2. Change of variables
2.3. Examples of one-dimensional p.d.f.s
2.4. Distributions of more than one random variable
2.5. Examples of n-dimensional p.d.f.s
2.6. Reproducibility
2.7. Some theorems of probability theory
3. Monte Carlo
3.1. Random generation of uniform numbers
3.2. Generation of different p.d.f.s
3.3. The inverse transformation method
3.4. The composition method
3.5. Von Neumann’s method
3.6. Stratified sampling method
3.7. Events with weight
3.8. Monte Carlo integration
3.9. Markov chains
4. Statistical inference
4.1. Non-parametric estimation
4.2. Parametric estimation
4.3. Confidence intervals
4.4. Fisher Information
4.5. Sufficient statistics
4.6. Cramér-Rao inequality
4.7. Construction of estimators
4.8. The maximum likelihood method
4.9. The minimum chi-squared method
5. Statistical tests
5.1. Hypothesis test
5.2. Significance test
5.3. Decision theory
6. Advanced topics
6.1. Feldman-Cousins criterion for confidence intervals
6.2. The sPlot method
6.3. The sFit method
7. Multivariate analysis and statistical treatment techniques
7.1. Introduction to multivariate data analysis
7.2. Data analysis and representation; statistical distances
7.3. Principal component analysis
7.4. Clustering
7.5. Discriminant analysis
7.6. Non-parametric methods of estimation of a probability density function
7.7. Hands-on exercises
8. Neural Networks
8.1. Basic concepts of Artificial Neural Networks
8.2. Design, training and use of Neural Networks
8.3. Self-organizing maps
8.4. Hands-on exercises
9. Data mining
9.1. Introduction to data mining: basic concepts
9.2. Combination of data analysis techniques to implement a data mining procedure
9.3. Complementary topics: Big data, artificial intelligence, cloud computing
Official assessment of learning outcomes
There is no exam for this subject. Instead, 6 problem-solving assignments are set during the course. Grading is based on the assessment of the reports submitted.
Examination-based assessment
Repeat assessment: students have to repeat and resubmit the 6 problem-solving assignments following the instructions from the lecturers. Once the assignments have been assessed, students take an oral exam on their contents. If this exam is successfully passed, the final grade is calculated from the marks of the assignments; otherwise, the subject is graded as failed.
Reading and study resources
Book
DeGroot, Morris H. Probability and statistics. 4th ed. Boston : Pearson Education, cop. 2012
2nd ed.
Feller, William. An introduction to probability theory and its applications. 2nd ed. New York : Wiley, 1972. v. 2
https://cercabib.ub.edu/discovery/search?vid=34CSUC_UB:VU1&search_scope=MyInst_and_CI&query=any,contains,b1536375*
Witten, Ian H. ; Frank, Eibe ; Hall, Mark A. ; Pal, Christopher. Data mining : practical machine learning tools and techniques. 4th ed. Burlington, [etc.] : Morgan Kaufmann, cop. 2017. ISBN 978-0128042915
https://cercabib.ub.edu/discovery/search?vid=34CSUC_UB:VU1&search_scope=MyInst_and_CI&query=any,contains,b1727639*
Landau, David P. ; Binder, K. A guide to Monte Carlo simulations in statistical physics. 4th ed. Cambridge : Cambridge University Press, cop. 2015
Video, DVD and film
Neural Networks: Zero to Hero: YouTube series on neural networks
https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ
Article
Weinzierl, Stefan. "Introduction to Monte Carlo methods", at: http://arxiv.org/abs/hep-ph/0006269
Web page
Scientific computing tools for Python: https://www.scipy.org/about.html
Introduction to Probability for Data Science: https://probability4datascience.com/
More information at: http://grad.ub.edu/grad3/plae/AccesInformePDInfes?curs=2023&assig=568423&ens=M0D0B&recurs=pladocent&n2=1&idioma=ENG